Immune Repertoire Sequencing: Methodologies and Experimental Design

The immune system recognizes an enormous range of antigens through the high diversity of T cell receptors (TCRs) and B cell receptors (BCRs), and immune repertoire sequencing (IR-Seq) is the core technology for analyzing this diversity. TCR-seq targets T cell receptor gene sequences and can reveal dynamic changes in the cellular immune response, such as clonal expansion and subtype distribution, providing a key basis for assessing prognosis in tumor immunotherapy and for immune monitoring of infectious diseases. BCR-seq targets B cell receptors and can capture antibody diversity and somatic hypermutation in humoral immunity, supporting vaccine development and research on autoimmune diseases.

However, the reliability and value of IR-Seq results depend heavily on sound methodology and rigorous experimental design. From sample type selection and nucleic acid extraction strategy, to primer design and library construction mode, to the choice of sequencing platform and depth, each step needs to be optimized for the research objective (such as mutation analysis of BCR or clone tracking of TCR). A systematic review of IR-Seq methodology and experimental design is therefore important for advancing basic immunology research and clinical translation.

This article systematically describes immune repertoire sequencing methodology, from sample preparation, primer design, and library construction to methods for determining optimal sequencing depth, and discusses current challenges and future directions to support related research and clinical translation.

Sample Procurement and Preparation

The sample is the foundation of an immune repertoire sequencing experiment: its collection quality and preparation process directly determine the reliability and accuracy of the sequencing data. Three aspects are central: sample type selection, timeliness of sample processing, and quality control of nucleic acid extraction.

Sample Type Selection

Samples for immune repertoire sequencing come from a wide range of sources, including blood, tissue, pleural effusion, and other body fluids.

  • Blood is the most commonly used sample type because it is easy to obtain and rich in immune cells, and it can be used to study systemic immune responses.
  • Tissue samples (such as tumor tissue and tissue from sites of infection) can reveal the characteristics of the local immune microenvironment, but attention should be paid to differences in immune cell distribution that may be caused by tumor heterogeneity.
  • Body fluid samples such as pleural effusion and cerebrospinal fluid have particular value in the study of specific diseases (such as lung infection and autoimmune diseases of the nervous system), because they can reflect the immune state at the disease site.

When selecting sample types, it is necessary to weigh the accessibility of samples and the specificity of immune information according to the purpose of the study, and if necessary, adopt the strategy of joint analysis of multiple types of samples.

Timeliness of Sample Processing

The timeliness of sample processing is critical to the stability of the immune repertoire. The state of immune cells, including their surface receptors (such as TCR/BCR), changes over time. If cells are not processed promptly after removal from the body, the receptor diversity information may be distorted by metabolic disturbance, apoptosis, or changes in activation state:

  • For blood samples, it is recommended to isolate cells within 2 hours of collection. If the samples cannot be processed in time, they should be stored at room temperature in a dedicated preservation tube (such as a Streck BCT tube) for no more than 72 hours.
  • Tissue samples should be chilled quickly after excision, and tissue dissociation or cryopreservation in liquid nitrogen should be completed within 30 minutes. Repeated freeze-thaw cycles in particular should be avoided, because they destroy cell structure, fragment nucleic acids, and reduce the quality of subsequent sequencing.

Block diagram of the proposed approach for discovering cell types in scRNA-seq data (Vasighizaker et al., 2022)

Quality Control of Nucleic Acid Extraction

High-quality nucleic acid is a prerequisite for accurate sequencing data. During nucleic acid extraction, three key metrics must be strictly controlled: purity, concentration, and integrity.

  • For purity, the A260/A280 ratio of RNA samples should ideally be between 1.8 and 2.2, and the A260/A230 ratio should be greater than 2.0. A ratio outside this range suggests contamination by protein, phenol, or salts, and the sample should be purified again.
  • A Qubit fluorometer is recommended for concentration measurement, because its sensitivity and accuracy are significantly higher than those of a spectrophotometer.
  • For integrity, samples are analyzed by electrophoresis on an Agilent Bioanalyzer, and the RNA Integrity Number (RIN) of RNA samples should be ≥7. If a sample is severely degraded (RIN < 5), sequencing reads will be unevenly distributed, which affects the accuracy of V(D)J gene assembly.
  • In addition, extracted nucleic acid should be stored at -80℃ in aliquots to avoid damage from repeated freeze-thaw cycles.
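
The thresholds above can be turned into a simple automated check. The following is a minimal sketch, assuming the measurements are already available as numbers; the `RnaQc` class and the `rna_qc_issues` function are hypothetical names introduced here for illustration, not part of any instrument software.

```python
from dataclasses import dataclass


@dataclass
class RnaQc:
    """Measurements for one RNA extract (field names are illustrative)."""
    a260_a280: float
    a260_a230: float
    rin: float


def rna_qc_issues(sample: RnaQc) -> list[str]:
    """Return the thresholds quoted in the text that the sample fails."""
    issues = []
    if not 1.8 <= sample.a260_a280 <= 2.2:
        issues.append("A260/A280 outside 1.8-2.2")
    if sample.a260_a230 <= 2.0:
        issues.append("A260/A230 not above 2.0")
    if sample.rin < 7:
        issues.append("RIN below 7")
    return issues


# Two made-up extracts: one clean, one degraded and protein-contaminated.
good = RnaQc(a260_a280=2.0, a260_a230=2.1, rin=8.5)
degraded = RnaQc(a260_a280=1.6, a260_a230=2.1, rin=4.2)

print(rna_qc_issues(good))
print(rna_qc_issues(degraded))
```

An empty list means the extract meets all three criteria; any returned message points to the metric that calls for re-purification or re-extraction.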

BCR vs. TCR Repertoire Sequencing

BCR and TCR are the core objects of immune repertoire research. They differ significantly in molecular structure, immune function, and sequencing requirements, so experimental schemes need to be designed accordingly.

Differences in Structure and Diversity

  • BCR is composed of a heavy chain (IGH) and a light chain (IGK/IGL). Besides V(D)J recombination and N-region insertion, BCR diversity also depends on somatic hypermutation (SHM) and class switch recombination (CSR) during B cell maturation.
  • TCR is composed of an α chain (TRA) and a β chain (TRB), or a γ chain (TRG) and a δ chain (TRD). Its diversity comes mainly from V(D)J recombination and N-region insertion, and there is no SHM.
  • As a result, BCR repertoire sequencing should focus on detecting SHM sites, while TCR repertoire sequencing should focus on analyzing V(D)J recombination patterns.
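
To make "detecting SHM sites" concrete, the sketch below counts mismatches between an observed BCR sequence and its germline counterpart. It assumes the two sequences have already been aligned to equal length by an upstream tool; the function name `shm_frequency` and the toy sequences are illustrative, not taken from any real dataset.

```python
def shm_frequency(observed: str, germline: str) -> float:
    """Fraction of compared positions where observed differs from germline.

    Both strings must already be aligned to the same length. Positions with
    a gap ('-') or an ambiguous base ('N') in either sequence are skipped.
    """
    if len(observed) != len(germline):
        raise ValueError("sequences must be aligned to equal length")
    compared = 0
    mismatches = 0
    for obs, germ in zip(observed.upper(), germline.upper()):
        if obs in "-N" or germ in "-N":
            continue
        compared += 1
        if obs != germ:
            mismatches += 1
    return mismatches / compared if compared else 0.0


# Toy 18-nt example with two substitutions relative to the germline.
germline_v = "CAGGTGCAGCTGGTGGAG"
observed_v = "CAGGTACAGCTGGTGCAG"
print(f"SHM frequency: {shm_frequency(observed_v, germline_v):.3f}")
```

For a TCR sequence the same calculation would be expected to return a value near zero apart from sequencing or PCR error, which is one reason error correction matters more for BCR mutation analysis.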

Immune Function Difference

  • BCR mainly mediates humoral immunity: B cells bind antigen through the BCR and its secreted form, the antibody. BCR repertoire sequencing is often used for antibody response analysis after infection, research on autoantibody-related diseases (such as rheumatoid arthritis), and evaluation of vaccine efficacy.
  • TCR mainly mediates cellular immunity and recognizes antigen peptide-MHC complexes presented by antigen-presenting cells. TCR repertoire sequencing is mostly used to predict the efficacy of tumor immunotherapy (such as analysis of TCR clonal expansion during PD-1 inhibitor treatment), monitor transplant rejection, and study T cell-mediated infections (such as tuberculosis).

Difference in Sequencing Requirements

  • BCR requires more accurate sequence analysis because of SHM (for example, long-read sequencing to capture the complete heavy chain variable region and resolve the distribution of SHM sites).
  • TCR requires higher sequencing throughput to cover more clonotypes because of the diversity of α/β chain combinations, so short-read, high-throughput platforms (such as Illumina) are more suitable.
  • In addition, the BCR light chain (IGK/IGL) complicates the definition of clonality: heavy and light chain sequences need to be analyzed together to distinguish clones accurately, while TCR clonality is usually defined by the α+β or γ+δ chain combination.
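
The effect of chain pairing on clone definition can be illustrated with a small grouping example. This is a deliberately simplified sketch: it assumes paired-chain data per cell and treats clonotypes as exact CDR3 matches, whereas real pipelines usually also use V/J gene calls and sequence similarity. The records and the `count_clonotypes` function are hypothetical.

```python
from collections import Counter

# Hypothetical per-cell records: (cell_id, heavy_chain_cdr3, light_chain_cdr3)
cells = [
    ("cell1", "CARDYW", "CQQYNSYPLTF"),
    ("cell2", "CARDYW", "CQQYNSYPLTF"),
    ("cell3", "CARDYW", "CMQALQTPYTF"),
]


def count_clonotypes(records, paired=True):
    """Count cells per clonotype, keyed by heavy+light CDR3 or heavy CDR3 only."""
    keys = ((heavy, light) if paired else (heavy,) for _, heavy, light in records)
    return Counter(keys)


heavy_only = count_clonotypes(cells, paired=False)
paired = count_clonotypes(cells, paired=True)
print(len(heavy_only), "clonotype(s) using the heavy chain alone")
print(len(paired), "clonotype(s) using heavy + light chains")
```

In this toy data the heavy chain alone merges all three cells into one clone, while the paired definition separates the cell that carries a different light chain.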

TCR structure and generation (Aran et al., 2022)

Primer Design Strategies for Immune Repertoire Sequencing

Primer design is the key to specifically capturing TCR/BCR variable region sequences in immune repertoire sequencing. It must balance three principles: completeness of coverage, amplification specificity, and platform compatibility. The core strategy revolves around the choice of target region and primer type.

A. Primer design based on the target region

a) The core goal of immune repertoire sequencing is to capture the variable region (especially the CDR3 region) produced by V(D)J recombination. Common design approaches include:

  • A forward primer is designed against a conserved sequence in the V region and a reverse primer against a conserved sequence in the J region, which specifically amplifies the V-J (or V-D-J) fragment containing the CDR3 region.
  • A forward primer is designed against the 5' untranslated region (UTR) and a reverse primer against a conserved sequence in the constant region (C region), which suits cases where the variable and constant regions need to be analyzed together (such as BCR class switch analysis).
  • An authoritative database (such as IMGT) should be consulted to obtain the conserved V, J, and C region sequences of TCR/BCR for each species, so that the primers cover all known subtypes.

Variation of diversity measures with read-length (Bashford-Rogers et al., 2014)

B. Design of multiplex PCR primers and a single primer pool

a) Because TCR/BCR loci contain many V and J gene segment subtypes (for example, 52 V subtypes and 13 J subtypes in the human TRB gene), a multiplex PCR primer design is needed: forward primers covering the different V subtypes are mixed into a "V primer pool" and reverse primers covering the different J subtypes into a "J primer pool", so that fragments from many subtypes are amplified together in a single PCR.

b) The design must optimize the concentration ratio of the primers to avoid amplification bias toward some subtypes caused by primer competition, and keep primer lengths within a narrow range (usually 18-25 bp) so that amplification efficiency is uniform.

C. Primer modification and platform adaptation

a) Primer modification is a key step toward high-quality sequencing output and must balance platform compatibility with amplification efficiency. To suit different sequencing platforms, a platform-specific adapter sequence is added at the 5' end of the primer:

  • The Illumina platform requires P5/P7 adapters. These sequences serve as the binding sites for bridge PCR, so they determine whether the subsequent sequencing reaction can proceed.
  • The Ion Torrent platform requires the A and P1 adapter sequences to suit its semiconductor sequencing technology.
  • In addition, introducing barcodes enables pooled sequencing of multiple samples, which significantly reduces experimental cost and makes fuller use of sequencing throughput. Barcodes should have a balanced base composition and share no homology with one another, to avoid barcode crosstalk between samples.
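
The two barcode principles (balanced base composition and no close similarity between barcodes) can be checked programmatically. The sketch below is one possible check; the GC range of 40-60% and the minimum pairwise Hamming distance of 3 are assumed thresholds chosen for illustration, and the barcode sequences are made up.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length barcodes differ."""
    if len(a) != len(b):
        raise ValueError("barcodes must have equal length")
    return sum(x != y for x, y in zip(a, b))


def gc_fraction(seq: str) -> float:
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def check_barcodes(barcodes, min_distance=3, gc_range=(0.4, 0.6)):
    """Return a list of human-readable problems found in a barcode set."""
    problems = []
    for bc in barcodes:
        gc = gc_fraction(bc)
        if not gc_range[0] <= gc <= gc_range[1]:
            problems.append(f"{bc}: GC fraction {gc:.2f} outside range")
    for a, b in combinations(barcodes, 2):
        d = hamming(a, b)
        if d < min_distance:
            problems.append(f"{a}/{b}: Hamming distance {d} below {min_distance}")
    return problems


# The first two barcodes differ at only one base, so one sequencing error
# could assign a read to the wrong sample.
barcode_set = ["ACGTACGT", "ACGTACGA", "TGCATGCA"]
for problem in check_barcodes(barcode_set):
    print(problem)
```

A larger minimum distance tolerates more sequencing errors before two samples become confusable, at the cost of fewer usable barcodes of a given length.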

During primer design, primer secondary structure and primer-primer interactions strongly affect amplification specificity. A hairpin structure can prevent the primer from binding the template strand effectively, and primer dimers consume polymerase and dNTPs in the reaction, which reduces amplification efficiency of the target sequence. The melting temperature (Tm), GC content, free energy, and other primer parameters should therefore be predicted and optimized with dedicated software such as OligoCalc and Primer3.
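
As a rough illustration of the parameters involved, the sketch below computes GC content, an approximate Tm, and a crude 3'-end complementarity check for a primer pair. It is not a substitute for OligoCalc or Primer3: the Tm uses the Wallace rule, Tm = 2(A+T) + 4(G+C), which is only a coarse estimate for short oligonucleotides, and the dimer check simply looks for an exact reverse-complement match of the last few bases. The primer sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.upper().translate(COMPLEMENT)[::-1]


def gc_percent(seq: str) -> float:
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Approximate Tm in degrees C by the Wallace rule (coarse estimate only)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def three_prime_overlap(primer_a: str, primer_b: str, k: int = 5) -> bool:
    """True if the last k bases of primer_a can base-pair with part of primer_b."""
    tail = primer_a.upper()[-k:]
    return reverse_complement(tail) in primer_b.upper()


primer = "ACGTGGCTAGCTAGGATCCA"   # hypothetical 20-nt primer
partner = "GGTGGATCC"             # hypothetical oligo complementary to its 3' end

print(len(primer), gc_percent(primer), wallace_tm(primer))
print(three_prime_overlap(primer, partner))  # potential primer-dimer
print(three_prime_overlap(primer, primer))   # no self-dimer at the 3' end
```

In a multiplex pool, the same pairwise check would be run across every combination of V and J primers, since any complementary 3' end can seed a primer dimer.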

Immune Repertoire Sequencing Library Preparation Strategies

Library construction is the step that connects nucleic acid samples to the sequencing platform. The strategy should be chosen according to template type (DNA/RNA), sequencing target (full-length or partial variable region), and platform characteristics. The core elements are template type selection, amplification and adapter addition mode, and quality control.

BCR repertoire sequence analysis (Xu et al., 2022)

Selection Strategy based on DNA and RNA

  • DNA library: Genomic DNA is used as the template and the rearranged V(D)J fragment is amplified by PCR. This suits analysis of immune cell clones at the genome level, independent of gene expression level, but it cannot distinguish functional from non-functional receptors.
  • RNA library: Total RNA or mRNA is used as the template; cDNA is synthesized by reverse transcription and then amplified by PCR. This suits analysis of receptor expression levels, and functional receptors can be enriched by targeting the constant region, but the approach is strongly affected by RNA degradation, so sample quality must be strictly controlled.

Amplification and Adapter Addition Mode

  • One-step PCR: V(D)J fragment amplification and adapter addition (with the adapter pre-attached at the 5' end of the primer) are completed in a single PCR reaction. This is simple and time-saving, but it demands high primer specificity and is prone to nonspecific amplification.
  • Two-step PCR: the first step amplifies only the V(D)J fragment, and the second step adds adapters and barcodes to both ends of the amplified product. Amplification specificity can be optimized in the first step (for example, by adjusting the annealing temperature), which reduces the proportion of nonspecific products and makes this approach more suitable for complex samples.

Library Quality Control

Library fragment size is checked on an Agilent Bioanalyzer (IR-seq libraries are usually 200-500 bp and should match the target amplicon), and library concentration is measured by Qubit or digital PCR, to confirm that the library has no obvious primer dimers (< 100 bp) and that the concentration meets the requirements of the sequencing platform (for example, the concentration of a single-sample library on the Illumina platform should be ≥10 nM). In addition, the V/J gene coverage of the library can be verified by pilot sequencing (low-depth sequencing) to avoid data bias introduced during library construction.
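
Because Qubit reports a mass concentration (ng/µL) while the platform requirement above is given in nM, a conversion is needed. The sketch below uses the common approximation of 660 g/mol per base pair of double-stranded DNA; the function names are illustrative, and the default thresholds simply restate the 200-500 bp and ≥10 nM figures from the text.

```python
AVG_MASS_PER_BP = 660.0  # g/mol per base pair of dsDNA (average value)


def library_molarity_nm(conc_ng_per_ul: float, mean_fragment_bp: float) -> float:
    """Convert a dsDNA mass concentration (ng/uL) to molarity (nM)."""
    if conc_ng_per_ul < 0 or mean_fragment_bp <= 0:
        raise ValueError("concentration must be >= 0 and fragment size > 0")
    return conc_ng_per_ul * 1e6 / (AVG_MASS_PER_BP * mean_fragment_bp)


def library_passes_qc(conc_ng_per_ul, mean_fragment_bp,
                      min_nm=10.0, size_range=(200, 500)) -> bool:
    """Check fragment size and molarity against the thresholds in the text."""
    in_range = size_range[0] <= mean_fragment_bp <= size_range[1]
    return in_range and library_molarity_nm(conc_ng_per_ul, mean_fragment_bp) >= min_nm


# A 400 bp library at 2 ng/uL is about 7.6 nM; at 4 ng/uL it is about 15.2 nM.
for conc in (2.0, 4.0):
    nm = library_molarity_nm(conc, 400)
    print(f"{conc} ng/uL at 400 bp = {nm:.2f} nM, pass = {library_passes_qc(conc, 400)}")
```

The mean fragment size should come from the Bioanalyzer trace of the same library, since an error in fragment size translates directly into an error in the calculated molarity.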

Determining Optimal Sequencing Depth

Sequencing depth directly affects the completeness (such as the clone detection rate) and accuracy (such as reliable identification of low-abundance clones) of immune repertoire sequencing data. It should be determined from the research objectives, the sample type, and the complexity of the immune repertoire. The core principle is to cover the target clonotypes while avoiding wasteful over-sequencing.

Factors Affecting Sequencing Depth

  • Complexity of the immune repertoire: PBMCs from healthy individuals have a highly diverse repertoire (the number of TCR clones can reach 10^6-10^7) and need a higher sequencing depth (for example, 5-10 million reads per sample). In tumor tissue, by contrast, TILs may show oligoclonal expansion with low diversity, so the sequencing depth can be reduced accordingly (for example, 2-5 million reads).
  • Research objectives: Detecting rare, low-abundance clones (such as in tumor minimal residual disease monitoring) requires a higher sequencing depth (such as 10-20 million reads). If only the composition of dominant clones and diversity indices are analyzed, the depth can be reduced (for example, 3-5 million reads).
  • Sample type: In single-cell IR-seq, each cell contains only 1-2 TCR/BCR clonotypes, so the sequencing depth needs to match the number of cells (for example, 500,000-1 million reads for 10,000 cells) to ensure that the receptor sequence of each cell is captured.
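
The single-cell figure above corresponds to roughly 50-100 reads per cell, which can be scaled to other cell numbers with simple arithmetic. The helper names below are illustrative, and the per-cell target is only the value implied by the example in the text, not a platform specification.

```python
import math


def reads_per_cell(total_reads: int, n_cells: int) -> float:
    """Average number of reads available per cell."""
    if n_cells <= 0:
        raise ValueError("n_cells must be positive")
    return total_reads / n_cells


def reads_needed(n_cells: int, target_reads_per_cell: float) -> int:
    """Total reads required to reach a target average depth per cell."""
    return math.ceil(n_cells * target_reads_per_cell)


# The example from the text: 10,000 cells with 500,000 to 1 million reads.
print(reads_per_cell(500_000, 10_000))    # lower bound, reads per cell
print(reads_per_cell(1_000_000, 10_000))  # upper bound, reads per cell

# Scaling the upper bound to a hypothetical 5,000-cell experiment.
print(reads_needed(5_000, 100))
```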

Evaluation and Optimization Method

The optimal depth is determined by saturation curve analysis. The same library is sequenced at a gradient of depths, and the number of detected clones is plotted against the number of sequencing reads. When the curve flattens, the corresponding read number is the saturation depth, which is taken as the optimal sequencing depth for the sample.

In addition, the sequencing depths used in comparable studies (such as the depths commonly reported for IR-seq of PBMC samples in the literature) can serve as a reference, adjusted to the experimental budget to balance data quality and cost.
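
A saturation curve can also be approximated computationally by subsampling reads from a single deep run, which is assumed here as a stand-in for physically sequencing at several depths. In the sketch below, each read is represented only by its clonotype label; the function names, the plateau threshold, and the toy repertoire are all illustrative assumptions.

```python
import random


def saturation_curve(read_clonotypes, depths, seed=0):
    """Return (depth, distinct clonotypes) pairs for nested random subsamples."""
    shuffled = list(read_clonotypes)
    random.Random(seed).shuffle(shuffled)
    # Clamp depths to the number of reads available, then sort and deduplicate.
    usable = sorted(set(min(d, len(shuffled)) for d in depths))
    return [(d, len(set(shuffled[:d]))) for d in usable]


def plateau_depth(curve, min_new_clones_per_1000_reads=1.0):
    """First depth after which extra reads add fewer new clones than the threshold."""
    for (d0, c0), (d1, c1) in zip(curve, curve[1:]):
        gain = (c1 - c0) / (d1 - d0) * 1000
        if gain < min_new_clones_per_1000_reads:
            return d0
    return None  # no plateau reached within the sampled depths


# Toy repertoire: 5 dominant clones (100 reads each) and 50 rare clones (2 reads each).
reads = []
for i in range(5):
    reads += [f"dominant_{i}"] * 100
for i in range(50):
    reads += [f"rare_{i}"] * 2

curve = saturation_curve(reads, depths=[50, 100, 200, 400, 600])
for depth, n_clones in curve:
    print(depth, n_clones)
```

Because the subsamples are nested prefixes of one shuffled list, the clone count never decreases with depth. If `plateau_depth` returns `None`, the library has not reached saturation at the depths examined and deeper sequencing may still reveal new clones.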

Chord diagrams showing the pairing of V and J segments within (a) productive and (b) nonproductive IgM sequences from a single healthy individual (Hoehn et al., 2016)

Conclusion

The experimental design of immune repertoire sequencing is a systematic process. Sample collection and preparation must safeguard nucleic acid quality. BCR and TCR sequencing must be tailored to the characteristics of each receptor. Primer design and library construction must balance specificity and completeness. Sequencing depth must match the research objectives to achieve adequate coverage at a controlled cost.

At present, immune repertoire sequencing experiments still face challenges, such as limited reproducibility caused by sample heterogeneity, limited sensitivity for detecting low-abundance clones, and a lack of standardization among library construction methods on different platforms.

In the future, sample pretreatment should be optimized, more efficient primer design algorithms developed, and unified standards for library construction and sequencing depth established, to further improve the reliability and comparability of immune repertoire sequencing data. These technical improvements will help move IR-seq from basic research into the clinic and provide stronger technical support for disease diagnosis, treatment monitoring, and the analysis of immune mechanisms.

References

  1. Vasighizaker A, Danda S, Rueda L. "Discovering cell types using manifold learning and enhanced visualization of single-cell RNA-Seq data." Sci Rep. 2022 12(1): 120.
  2. Aran A, Garrigós L, Curigliano G, Cortés J, Martí M. "Evaluation of the TCR Repertoire as a Predictive and Prognostic Biomarker in Cancer: Diversity or Clonality?" Cancers (Basel). 2022 14(7): 1771.
  3. Bashford-Rogers RJ, Palser AL, Idris SF, et al. "Capturing needles in haystacks: a comparison of B-cell receptor sequencing methods." BMC Immunol. 2014 15: 29.
  4. Xu Z, Ismanto HS, Zhou H, Saputri DS, Sugihara F, Standley DM. "Advances in antibody discovery from human BCR repertoires." Front Bioinform. 2022 2: 1044975.
  5. Hoehn KB, Fowler A, Lunter G, Pybus OG. "The Diversity and Molecular Evolution of B-Cell Receptors during Infection." Mol Biol Evol. 2016 33(5): 1147-1157.
For research purposes only, not intended for clinical diagnosis, treatment, or individual health assessments.




Copyright © CD Genomics. All Rights Reserved.